The assembly evaluation report contains important statistics that reflect the quality of
the assemblies in the file. Figure 8.4 shows a partial evaluation report for the three samples.
The colored heatmap indicates quality from the worst (red) to the best (blue). At the
top, there are links that lead to each sample's graph, as shown in Figure 8.5. The graphs
show the key identified bacterial taxa and their abundance. Refer to the program user
manual, available at “http://cab.cc.spbu.ru/quast/manual.html”, to read more
about the program's use and the different report sections, and refer to Chapter 3 to read more
about the de novo assembly evaluation metrics.
8.2.6 Mapping Reads to the Assemblies
We have already created the metagenomic FASTQ files, which were produced from the reads
that did not map to the host reference genome. Then, we used a de novo assembler to produce an
assembly (scaffolds.fasta) for each sample, which may contain genomic sequences of several
microbes. In this step, we will use an aligner to map the reads in the FASTQ files to the
respective assembly. For this purpose, we can use the Bowtie2 aligner. First, we need to build
an index for the “scaffolds.fasta” file of each sample, and then we will use that index to align
the reads in the FASTQ files. Now, let us create a directory named “assemblies” in the main project
directory and copy the scaffolds FASTA files from the three sample directories into this new
directory with new file names as follows:
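A minimal sketch of this copy step is given here as a small shell function. It assumes that each sample directory sits directly under the project directory and contains a file named “scaffolds.fasta”. The function name collect_assemblies, the sample directory names (sample1, sample2, sample3), and the renaming pattern (sample name plus “.fasta”) are hypothetical placeholders for illustration, so substitute the names used in your own project.

```shell
# Hypothetical helper: gather each sample's scaffolds.fasta into
# <project_dir>/assemblies/, renamed after the sample directory.
# Usage: collect_assemblies <project_dir> <sample_dir> [<sample_dir> ...]
collect_assemblies() {
    project_dir="$1"
    shift
    # Create the target directory (no error if it already exists)
    mkdir -p "$project_dir/assemblies"
    for sample in "$@"; do
        src="$project_dir/$sample/scaffolds.fasta"
        if [ -f "$src" ]; then
            # Assumed naming scheme: <sample>.fasta
            cp "$src" "$project_dir/assemblies/${sample}.fasta"
        else
            echo "warning: $src not found, skipping" >&2
        fi
    done
}
```

From the main project directory, the function could be called as `collect_assemblies . sample1 sample2 sample3`. Renaming each copy after its sample keeps the three assemblies distinguishable once they share a single directory, since all of them are originally named “scaffolds.fasta”.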
FIGURE 8.4 An assemblies’ evaluation report generated with metaquast.py.